Effective Ranking of XML Keyword Search Results (Extended Version)

Authors

  • Arash Termehchy
  • Marianne Winslett
Abstract

The popularity of XML has exacerbated the need for an easy-to-use, high-precision query interface for XML data. When traditional document-oriented keyword search techniques do not suffice, natural language interfaces and keyword search techniques that take advantage of XML structure make it very easy for ordinary users to query XML databases. Unfortunately, current approaches to processing these queries rely heavily on heuristics that are intuitively appealing but ultimately ad hoc. These approaches often retrieve false positive answers, overlook correct answers, and cannot rank answers appropriately. To address these problems for data-centric XML, we propose coherency ranking, a domain- and database design-independent ranking method for XML keyword queries that is based on an extension of the concepts of data dependencies and mutual information. With coherency ranking, the results of a keyword query are invariant under schema reorganization. We analyze the way in which previous approaches to XML keyword search approximate coherency ranking, and present efficient algorithms to process queries and rank their answers using coherency ranking. Our empirical evaluation with two real-world XML data sets shows that coherency ranking has better precision and recall and provides better ranking than all previous approaches. Coherency ranking can also be used for keyword queries over relational and graph data.

Similar resources

Keyword Search in XML Database with Relevance Ranking and Maintaining Stored Websites

Keyword search in XML databases provides access to XML data while overcoming keyword ambiguity. Users can search an XML database with keywords, as they would a text database. The work presents a novel IR-style approach that captures XML’s hierarchical structure well and works on pure keyword queries, independent of any schema information about the XML data. A search engine prototype called XReal is...

SAIL: Structure-aware indexing for effective and progressive top-k keyword search over XML documents

Keyword search in XML documents has recently gained a lot of research attention. Given a keyword query, existing approaches first compute the lowest common ancestors (LCAs), or their variants, of the XML elements that contain the input keywords, and then identify the subtrees rooted at the LCAs as the answer. In this paper we study how to use the rich structural relationships embedded in XML docu...

Exploiting ID References for Effective Keyword Search in XML Documents

In this paper, we study a novel Tree + IDREF data model for keyword search in XML. In this model, we propose the novel Lowest Referred Ancestor (LRA) pair, Extended LRA (ELRA) pair, and ELRA group semantics for effective and efficient keyword search. We develop efficient algorithms to compute the search results based on our semantics. An experimental study shows the superiority of our approach.

Demonstrating Effective Ranked XML Keyword Search with Meaningful Result Display

In this paper, we demonstrate an effective ranked XML keyword search with meaningful result display. Our system, named ICRA, recognizes a set of object classes in XML data for result display, defines matching semantics that meet users’ search needs more precisely, captures the ID references in XML data to find more relevant results, and adopts novel ranking schemes. ICRA achieves both high ...

XIOTR: A Terse Ranking of XIO for XML Keyword Search

The emergence of the Web has increased interest in XML data because XML has a flexible structure. Keyword search has attracted a great deal of attention for retrieving XML data because it is a user-friendly mechanism. However, it is hard to improve the search quality of keyword search directly, because many keyword-matched nodes may not contribute to the results. In many applications, the goal is to ...

Publication date: 2009